Results 1 - 8 of 8
1.
BMC Bioinformatics ; 25(1): 145, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580921

ABSTRACT

BACKGROUND: Drug targets in living organisms play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets is accurate but generally expensive, slow, and resource intensive. Computational methods are therefore highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory. METHODS: In this study, we developed a novel deep learning-based model, DPI_CDF, for predicting DPs from protein sequence alone. DPI_CDF utilizes evolutionary-based (i.e., histograms of oriented gradients for position-specific scoring matrix), physicochemical-based (i.e., component protein sequence representation), and compositional-based (i.e., normalized qualitative characteristic) properties of the protein sequence to generate features. A hierarchical deep forest model then fuses these three encoding schemes to build the proposed model. RESULTS: Under 10-fold cross-validation, the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, where it achieved a maximum accuracy of 95.01% and an MCC of 0.900. Compared with current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively. We believe DPI_CDF will help the research community identify druggable proteins and accelerate the drug discovery process. AVAILABILITY: The benchmark datasets and source code are available on GitHub: http://github.com/Muhammad-Arif-NUST/DPI_CDF .
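
The two metrics reported in this abstract, accuracy and MCC, can both be derived from the four cells of a binary confusion matrix. The sketch below is only an illustration of those standard formulas; the function names and the example counts are hypothetical and are not taken from the DPI_CDF paper or its repository.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not results from the paper).
tp, tn, fp, fn = 95, 90, 10, 5
print("accuracy:", round(accuracy(tp, tn, fp, fn), 3))
print("MCC:", round(mcc(tp, tn, fp, fn), 3))
```

Unlike accuracy, MCC uses all four cells symmetrically, which is why it is often reported alongside accuracy when the two classes are not equally represented.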


Subject(s)
Proteins , Software , Amino Acid Sequence , Position-Specific Scoring Matrices , Biological Evolution , Computational Biology/methods
2.
BMC Genomics ; 25(1): 151, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326777

ABSTRACT

BACKGROUND: The subcellular localization of mRNA has a substantial impact on the regulation of gene expression, cellular migration, and adaptation. However, the methods employed for experimental determination of this localization are arduous, time-intensive, and costly. METHODS: In this research article, we tackle the essential challenge of predicting the subcellular location of messenger RNAs (mRNAs) through the Unified mRNA Subcellular Localization Predictor (UMSLP), a machine learning (ML) based approach. We adopt an in silico strategy that incorporates four distinct feature sets: k-mer, pseudo k-tuple nucleotide composition, nucleotide physicochemical attributes, and a 3D sequence representation obtained via Z-curve transformation, to predict subcellular localization on a benchmark dataset across five distinct subcellular locales: nucleus, cytoplasm, extracellular region (ExR), mitochondria, and endoplasmic reticulum (ER). RESULTS: The proposed ML model UMSLP attains state-of-the-art results in predicting mRNA subcellular localization. On an independent testing dataset, UMSLP achieved over 87% precision, 94% specificity, and 94% accuracy. Compared to other existing tools, UMSLP outperformed mRNALocator, mRNALoc, and SubLocEP by 11%, 21%, and 32%, respectively, in average prediction accuracy across all five locales. SHapley Additive exPlanations analysis highlights the dominance of k-mer features in predicting cytoplasm, nucleus, ER, and ExR localizations, while Z-curve based features play pivotal roles in detecting mitochondrial localization. AVAILABILITY: We have shared the datasets, code, and a Docker API for users on GitHub at: https://github.com/smusleh/UMSLP .
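
To make the "3D sequence representation via Z-curve transformation" concrete, the following is a minimal sketch of the classic cumulative Z-curve, in which each prefix of the sequence maps to a point (x, y, z) built from running base counts. The function name `z_curve` is hypothetical, and the abstract does not say how UMSLP condenses the curve into fixed-length features, so this should be read as an illustration of the transformation itself rather than the authors' feature extractor.

```python
def z_curve(seq):
    """Return cumulative Z-curve points (x_n, y_n, z_n) for a nucleotide sequence.

    x: purines (A, G) minus pyrimidines (C, T)
    y: amino bases (A, C) minus keto bases (G, T)
    z: weak hydrogen bonding (A, T) minus strong hydrogen bonding (C, G)
    U is treated as T so that mRNA sequences can be passed directly.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper().replace("U", "T"):
        if base not in counts:
            continue  # skip ambiguous symbols such as N (an assumption of this sketch)
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points


# Toy sequence for illustration only.
for point in z_curve("AUGC"):
    print(point)
```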


Subject(s)
Endoplasmic Reticulum , Mitochondria , RNA, Messenger/genetics , Mitochondria/genetics , Computational Biology/methods , Machine Learning , Nucleotides
3.
ACS Omega ; 9(2): 2874-2883, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38250405

ABSTRACT

Methicillin-resistant Staphylococcus aureus (MRSA) is a growing concern for human lives worldwide. Anti-MRSA peptides act as potential antibiotic agents and play a significant role in combating MRSA infection. Traditional laboratory-based methods for annotating Anti-MRSA peptides are precise but challenging, costly, and time-consuming. Therefore, computational methods capable of identifying Anti-MRSA peptides can accelerate the drug design process for treating bacterial infections. In this study, we developed a novel sequence-based predictor "iMRSAPred" for screening Anti-MRSA peptides by incorporating energy estimation and physicochemical and sequential information. We addressed the class imbalance in the data by using the synthetic minority oversampling technique plus Tomek link (SMOTETomek) algorithm. Furthermore, the Shapley additive explanation method was leveraged to analyze the impact of top-ranked features in the prediction task. We evaluated multiple machine learning algorithms, i.e., CatBoost, Cascade Deep Forest, Kernel and Tree Boosting, support vector machine, and HistGBoost classifiers by 10-fold cross-validation and independent testing. The proposed iMRSAPred method significantly improved the overall performance in terms of accuracy and Matthews correlation coefficient (MCC) by 5.45 and 0.083%, respectively, on the training data set. On the independent data set, iMRSAPred improved accuracy and MCC by 3.98 and 0.055%, respectively. We believe that the proposed method would be useful in large-scale Anti-MRSA peptide prediction and provide insights into other bioactive peptides.
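
SMOTETomek combines two ideas: SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its minority-class neighbors, and Tomek-link cleaning then removes pairs of opposite-class points that are each other's nearest neighbor. The pure-Python sketch below shows both building blocks on toy 2D points. All names and data are hypothetical, and it is not the iMRSAPred pipeline; in practice a maintained library implementation would normally be used instead.

```python
import math


def smote_point(x, neighbor, lam):
    """Synthetic sample on the segment between x and a minority-class neighbor.

    lam is a number in [0, 1]; SMOTE usually draws it at random.
    """
    return tuple(a + lam * (b - a) for a, b in zip(x, neighbor))


def nearest_neighbor(i, points):
    """Index of the point closest to points[i] (Euclidean), or None if alone."""
    best, best_dist = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_dist:
            best, best_dist = j, d
    return best


def tomek_links(points, labels):
    """Pairs (i, j) of mutual nearest neighbors that carry different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        if j is not None and i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links


# Toy data for illustration only: label 0 is the minority class.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 10.0)]
labels = [0, 1, 1, 1, 0]
print(smote_point(points[0], points[4], 0.5))
print(tomek_links(points, labels))
```

Removing both members of each Tomek link (or only the majority-class member) cleans up the class boundary after oversampling; which of the two is done depends on the implementation.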

4.
Stud Health Technol Inform ; 305: 632-635, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37387111

ABSTRACT

Triple-negative breast cancer (TNBC) is an aggressive form of breast cancer with very high relapse and mortality rates. However, due to differences in the genetic architecture associated with TNBC, patients have different outcomes and respond differently to available treatments. In this study, we predicted the overall survival of TNBC patients in the METABRIC cohort using supervised machine learning to identify important clinical and genetic features associated with better survival. We achieved a slightly higher concordance index than the state of the art and identified biological pathways related to the top genes considered important by our model.
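
The concordance index mentioned above measures how well predicted risk scores rank patients by survival time. The following is a minimal sketch of Harrell's C-index under the usual convention that a pair is comparable when the patient with the shorter time had an observed event, and that tied risk scores count as half. The function name and the toy data are hypothetical and unrelated to the METABRIC cohort.

```python
def concordance_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times:  observed follow-up times
    events: 1 if the event (death) was observed, 0 if censored
    risks:  predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable when i had an observed event before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data for illustration only.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a "slightly higher" index reflects a modest improvement in ranking quality.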


Subject(s)
Triple Negative Breast Neoplasms , Humans , Machine Learning , Supervised Machine Learning , Aggression
6.
Front Public Health ; 11: 1125917, 2023.
Article in English | MEDLINE | ID: mdl-36950105

ABSTRACT

COVID-19 has taken a huge toll on our lives over the last 3 years. Global initiatives put forward by all stakeholders are still in place to combat this pandemic and help us learn lessons for future ones. While the vaccine rollout was not able to curb the spread of the disease for all strains, the research community is still trying to develop effective therapeutics for COVID-19. Although Paxlovid and remdesivir have been approved by the FDA against COVID-19, they are not free of side effects. Therefore, the search for a therapeutic solution with high efficacy continues in the research community. To support this effort, in this latest version (v3) of COVID-19Base, we have summarized the biomedical entities linked to COVID-19 that have been highlighted in the scientific literature after the vaccine rollout. Eight different topic-specific dictionaries, i.e., gene, miRNA, lncRNA, PDB entries, disease, alternative medicines registered under clinical trials, drugs, and the side effects of drugs, were used to build this knowledgebase. We have introduced a BLSTM-based deep-learning model to predict the drug-disease associations that outperforms the existing model for the same purpose proposed in the earlier version of COVID-19Base. For the very first time, we have incorporated disease-gene, disease-miRNA, disease-lncRNA, and drug-PDB associations covering the largest number of biomedical entities related to COVID-19. We have provided examples of and insights into different biomedical entities covered in COVID-19Base to support the research community by incorporating all of these entities under a single platform to provide evidence-based support from the literature. COVID-19Base v3 can be accessed from: https://covidbase-v3.vercel.app/. The GitHub repository for the source code and data dictionaries is available to the community from: https://github.com/91Abdullah/covidbasev3.0.


Subject(s)
COVID-19 , MicroRNAs , RNA, Long Noncoding , Humans , SARS-CoV-2 , Knowledge Bases
7.
BMC Bioinformatics ; 24(1): 109, 2023 Mar 22.
Article in English | MEDLINE | ID: mdl-36949389

ABSTRACT

BACKGROUND: Subcellular localization of messenger RNAs (mRNAs) plays a pivotal role in the regulation of gene expression, cell migration, and cellular adaptation. Experimental techniques for pinpointing the subcellular localization of mRNAs are laborious, time-consuming, and expensive. Therefore, in silico approaches for this purpose are attracting great attention in the RNA community. METHODS: In this article, we propose MSLP, a machine learning-based method to predict the subcellular localization of mRNA. We propose a novel combination of four types of features, representing k-mer, pseudo k-tuple nucleotide composition (PseKNC), physicochemical properties of nucleotides, and a 3D representation of sequences based on Z-curve transformation, to feed into a machine learning algorithm that predicts the subcellular localization of mRNAs. RESULTS: Using the combination of the above-mentioned features, ensemble-based models achieved state-of-the-art results in mRNA subcellular localization prediction tasks on multiple benchmark datasets. We evaluated the performance of our method in ten subcellular locations: cytoplasm, nucleus, endoplasmic reticulum (ER), extracellular region (ExR), mitochondria, cytosol, pseudopodium, posterior, exosome, and ribosome. An ablation study showed k-mer and PseKNC to be more dominant than other features for predicting cytoplasm, nucleus, and ER localizations. On the other hand, physicochemical properties and Z-curve based features contributed the most to ExR and mitochondria detection. SHAP-based analysis revealed the relative importance of features, providing better insight into the proposed approach. AVAILABILITY: We have implemented a Docker container and API for end users to run their sequences on our model. The datasets, the API code, and the Docker container are shared with the community on GitHub at: https://github.com/smusleh/MSLP .
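
As a rough illustration of the k-mer feature type named in this abstract, the sketch below converts a nucleotide sequence into a fixed-length vector of normalized k-mer frequencies over the alphabet A, C, G, T. The function name, the default k, and the choice to ignore k-mers containing ambiguous symbols are assumptions made for this example; the abstract does not specify how MSLP computes its k-mer features.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Normalized k-mer frequency vector in lexicographic k-mer order.

    U is treated as T so that mRNA sequences can be passed directly.
    k-mers containing symbols outside A, C, G, T are ignored.
    """
    seq = seq.upper().replace("U", "T")
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


# Toy sequence for illustration only.
features = kmer_frequencies("AUGGCCAUUGUAAUGGGCCGC", k=2)
print(len(features))
```

The vector length is 4^k regardless of sequence length, which is what allows sequences of different lengths to be fed to the same classifier.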


Subject(s)
Algorithms , Cell Nucleus , RNA, Messenger/genetics , Ribosomes , Machine Learning , Computational Biology/methods
8.
Stud Health Technol Inform ; 289: 77-80, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35062096

ABSTRACT

Acute Lymphoblastic Leukemia (ALL) is a life-threatening type of cancer with a high mortality rate. Early detection of ALL can both reduce the fatality rate and improve the diagnosis plan for patients. In this study, we developed the ALL Detector (ALLD), a deep learning-based network that distinguishes ALL patients from healthy individuals based on microscopic images of blast cells. We evaluated multiple DL-based models, and the ResNet-based model performed best, with 98% accuracy in the classification task. We also compared the performance of ALLD against state-of-the-art tools used for the same purpose, and ALLD outperformed them all. We believe that ALLD will support pathologists in diagnosing ALL in its early stages and reduce the overall burden on clinical practice.


Subject(s)
Deep Learning , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , Neural Networks, Computer